Letter from the Editor 


Summer brings time to reflect and recharge. The Summer 2017 volume of AIR Professional Files 
presents four articles with intriguing ideas to consider as you plan for the next academic year.


Data governance is a pressing issue for many IR professionals, as sources of data proliferate and
challenge our ability to control data integrity. In her article, Institutional Data Quality and the
Data Integrity Team, McGuire synthesizes and interprets results from 172 respondents to an
AIR-administered survey of postsecondary institutions on their data integrity efforts. She describes
the current state of data governance and offers strategies to encourage institutional leaders to
invest in data quality.


Those of us who work in assessment often take it for granted that assessment results will be used for learning improvement. Fulcher, 
Smith, Sanchez, and Sanders challenge this assumption by analyzing information from program assessment reports at their own 
institution. Needle in a Haystack: Finding Learning Improvement in Assessment Reports uncovers many possible reasons for the gap 
between obtaining evidence of student learning and using that evidence for improvement. The authors suggest ways to promote 
learning improvement initiatives, and share a handy rubric for evaluating assessment progress. 


Institutional researchers are beset with requests to form peer groups, and it seems that no one is ever satisfied with the results.
Two articles in this volume present very different methodologies for forming sets of comparison institutions. In her article, A Case
Study to Examine Three Peer Grouping Methodologies, D'Allegro compares peer sets generated by different selection indices. She
offers guidance for applying each index and encourages cautious interpretation of results. Rather than rummaging around for
the perfect peer set, Chatman proposes creating a clone, or doppelganger university, one that is constructed from disaggregated
components drawn from diverse data sources. In Constructing a Peer Institution: A New Peer Methodology, he walks us through the
process of creating peers for faculty salaries, instructional costs, and faculty productivity. While the constructed peer approach has its
challenges, the appeal of achieving a perfect fit peer is undeniable.


I hope your summer “reflection” inspires you to share your work with your IR colleagues through AIR Professional Files.


Sincerely, 


Sharron L. Ronco 
Abstract 

Higher education insiders trumpet 
the use of results for improvement 
as the most important part of the 
assessment cycle. Yet, at the same 


time, we acknowledge the rarity of 
improvement, especially at a program 
level. What are some reasons the 

most important phase of assessment 
occurs so infrequently? To seek 
answers, we investigated the “Use 

of Results” sections in 54 program- 
level assessment reports. In some 
respects, our findings were positive. On 
average, programs reported making 
approximately three curricular or 
pedagogical changes annually. A closer 


inspection, however, revealed concerns: 


(1) the curricular or pedagogical
changes were not explicitly linked
to learning outcomes, (2) programs
rarely reported making changes that
affect several classes, (3) many of
the reported changes were unclear,
and (4) few programs reassessed to
determine if changes actually led to
learning improvement. Our research
concludes by providing suggestions 
for how programs can more effectively 
use results to inform changes, reassess 
students to determine if changes led to 
learning improvement, and report on 
improvement processes. 


INTRODUCTION 


For more than 30 years, higher 
education has refined assessment 
methodologies to meet accountability 
demands and demonstrate value 
(Ewell, 2009). Yet, as Suskie (2010,
para. 8) observed, “Today we seem
to be devoting more time, money,
thought, and effort to assessment
than to helping faculty help students
learn as effectively as possible.” Other
researchers have come to a similar 
realization: although most institutions 
systematically collect assessment data, 
few use the data to improve student 
learning (Banta & Blaich, 2011; Blaich & 
Wise, 2011). 


Why aren't assessment results used 
for learning improvement? There 

are several theories: It could be 

that institutions incorrectly assume 
that using results for improvement 
can emerge from only interesting 
research findings and well-crafted 
reports (Blaich & Wise, 2011). It could 
also be that inconsistent and vague 
communication surrounding the use 
of results for improvement confuses 
programs (Smith, Good, Sanchez, 

& Fulcher, 2015). Furthermore, 
accreditation requirements, rather than 
intrinsic interests, might be the main 
driver of assessment practices (Kuh 
& Ikenberry, 2009). Indeed, a myopic 
focus on assessment activities (e.g., 
identifying outcomes and gathering 
data) unintentionally neglects
using results for student learning
improvement (Kinzie, Hutchings, &
Jankowski, 2015).

Figure 1. Depiction of Current Study Within Jonson et al.’s (2014) Heuristic Model of Influence


Not using results to inform curricular 
and pedagogical changes remains a 
serious problem for higher education. 
To investigate the issue, we analyzed 
“Use of Results” sections in 54 
assessment reports. While the current 
study emphasizes learning outcomes 
assessment at the academic degree 
program (e.g., bachelor’s degree in 
biology), many concerns and findings 
likely generalize to other assessment 
and institutional effectiveness 
initiatives. Indeed, the inability to use 
results to make changes that promote 
improvement is an institutional concern. 
Conceptualizing Use of 
Assessment Results 

We are not the first assessment 
practitioners to examine why using 
results to improve student learning 
remains uncommon. For example, 
Jonson, Guetterman, and Thompson 
(2014) believed that higher education 
could benefit from a new, broader 
definition of use of results. 


Instead of focusing on curricular and 
pedagogical changes intended to 
improve student learning, Jonson et 
al. (2014) created a model to describe 
various ways that discussing results 
can positively influence the culture of 
a university (Figure 1). For example, 
using assessment results for discussion 


can support taking direct action on 
educational practice or policy or 
changing people's ways of thinking 
about learning and assessment. Results 
for discussion can also alter people's 
emotions or attitudes regarding 
assessment practice and affirm the 
efficacy of an existing practice. 


Jonson and colleagues (2014) 
further explained that each of the 
aforementioned influences could lead 
to the following outcomes: 
* Evidence of improved student
learning
* Transformation of stakeholders
* Building new communities of
practice
* Generating support for policies and
practice


The Jonson et al. (2014) framework
sparks important conversations
about how to define and measure
using results for improvement, but
we believe that a narrower,
student-focused approach to using results
would be of greater benefit to higher
education. We define using assessment
results for improvement as collecting
and analyzing student learning data
to support taking direct actions
related to educational practice (i.e.,
making changes to curriculum and/or
pedagogy) that lead to evidence
of improved student learning (i.e.,
students’ assessment scores show
improvement after experiencing
modified curriculum or pedagogy).

Adopting the more-narrow definition
of use of results, one that centers on
student learning improvement, allows
us to keep in mind the overall intention
of assessment and higher education.
The current study is situated in the
more-narrow definition, which might
explain why we found so few examples
of using assessment results in the 54
reports we examined.


Box 1. Hypothetical Example: 1980s Pop Culture Degree Program 


At the conclusion of the 1980s Pop Culture degree program, students must be able to properly cite and reference a variety 
of sources in a research paper. In 2014-2015 the program used a rubric to evaluate all students’ final research papers. 
Rubric scores revealed that students were not successful at citing or referencing sources. During a departmental discussion, 
program faculty confirmed that many students struggle to properly cite and reference sources. 


After agreeing that the learning outcome of properly citing sources was both relevant and unmet, faculty agreed on 
curricular and pedagogical changes to address the issue. Before implementing new changes, faculty consulted with 
other instructors on campus and gathered information regarding what assignments could be effective at teaching such a
specific skill set. Changes to the core courses of the 1980s Pop Culture program began in the fall of 2015. Specifically, the 
instructors of the two classes where writing is heavily emphasized—PCUL401 (1980s Politics and Culture) and PCUL404 
(The 1980s and Today)—did the following: 


1. Participated in a faculty development workshop during which the instructors found and agreed on examples of 
students’ citing and referencing sources in their papers. Some examples were developing papers and others were 
advanced papers. 


2. Shared the results of the past writing assessment with students, emphasizing that citing and referencing sources is a
concern.


3. Provided modified examples of a developing and advanced paper to illustrate program expectations.

4. Created more in-class assignments to measure student progress, and encouraged students to rely on their own skills,
instead of on online citation software, to create references.

5. Used the writing rubric to evaluate students’ essays throughout the semester instead of using the rubric solely for the
final research paper.


Results from curricular and pedagogical changes suggested that students’ ability to cite and reference sources, as measured 
by the writing rubric, improved over time. Specifically, seniors’ scores on the citing and sourcing element increased from 2.6 
(between developing and competent) in 2015, the year before the curricular and pedagogical changes were implemented, 
to 3.2 in 2016 and 3.4 (between competent and advanced) in 2017, the years after the changes were implemented. 
Understanding the Use
of Results for Learning
Improvement in Assessment
Reports

Every year, academic programs at 
universities nationwide complete 
assessment reports that include a 
“Use of Results” section (Fulcher, 
Swain, & Orem, 2012). The current 
study examined the contents of 
these sections. More specifically, we 
investigated if changes to curricula 
or pedagogies were made based 
on assessment results and whether 
previous changes led to student 
learning improvement. 


To evaluate the degree to which 
assessment reports conveyed using 
results for improvement, we first 
identified several ideal features of 
the “Use of Results” section in the 
assessment reports: 


* Changes to curricula and 
pedagogies are made and reported. 

* Changes to curricula and 
pedagogies are matched with an 
intended student learning outcome 
(i.e., what students should know, 
think, or be able to do). 

* Changes to curricula and 
pedagogies are presented with a 
clear rationale (e.g., assessment 
data support changes). 

* Reassessments demonstrate
learning improvement (i.e., 
changes are at the program level 
and are effective). 


To make the ideal assessment report
more concrete, we provide a
hypothetical example: the 1980s
Pop Culture degree program (Box 1).
The Current Research Study 
Understanding how assessment reports 
could ideally connect assessment 
results to learning improvement efforts 
via curricular and pedagogical changes 
is important. We provided one simple 
example of a hypothetical program 

in an effort to clarify what the “Use 

of Results” section could, and should, 
include. 


The current study focused on 

real programs attempting to use 
assessment results. We reviewed and 
qualitatively rated 54 program reports, 
comparing their features to our ideal 
assessment report. In doing so, we 
addressed five research questions 
(RQs). 


Research Questions 

RQ 1. How extensive in magnitude are 
the reported changes to curricula and 
pedagogies? 

As we have explored, institutions and 
academic degree programs can use 
assessment results in different ways. 
Some use the results to inform changes 
to assessment instrumentation, 

while others use results to influence 
curricular and pedagogical changes. 
For those who used assessment 

results to change program curricula 

or pedagogies, we wanted to gauge 
the magnitude of the changes made, 
as described in assessment reports. 
That is, we wanted to see if the change 
was a course-level or a program-level 
change. If more students experience 
new curricula and pedagogy, we 
would expect to see more learning 
improvement at the program level. 


We defined and evaluated magnitude 
of change in terms of minor, moderate, 
major, or extensive changes. An 
example of a change coded as minor 
in magnitude could include a new or 
modified course assignment based on 
previous assessment results. A change 
of moderate magnitude could be a 
new or modified unit or segment of 
the course curriculum. Major changes 
could entail a complete redesign of an 
entire course. Finally, extensive changes 
necessitate a restructuring of the 
curriculum or pedagogical approaches 
that involved several courses within a 
given academic program. 


Again, we thought that perhaps 
programmatic changes of greater 
magnitude would be more likely to 
yield improved student learning. If 
faculty members are reporting that 
they only implemented changes 

of minor to moderate magnitude, 
this could help explain why no 
demonstrable student learning 
improvement exists. That is, using 
results to initiate only a minor or 
moderate change to curriculum, such 
as changing an assignment or unit in 
one course, might not be enough to 
move the needle at the program level. 


RQ 2. To what extent are curricular 
and pedagogical changes linked to 
student learning outcomes? 

To successfully improve student 
learning in a demonstrable way, 
faculty should focus assessment, 
pedagogical, and curricular efforts 
around specific student learning 
outcome(s) (Fulcher, Good, Coleman, 
& Smith, 2014). Once the learning 
outcome is identified, it should be 


clear how curricular and pedagogical 
modifications would enhance 
students’ skills, knowledge, or abilities. 


We defined and evaluated the match 
between changes and student learning 
outcomes, differentiating among four 
levels of connection in the “Use of 
Results” sections we evaluated: 


1. It might be unclear how the 
change is linked to student 
learning. 

2. It might be that the change is 
linked to student learning in 
general, but not directly to a 
specific student learning outcome 
of the program. 

3. It might be that the change is 
linked to a specific, program 
learning outcome and yet lack 
specificity about why or how the 
change aligns with that particular 
learning outcome. 

4. It might be that the change is 
clearly linked to a specific learning 
outcome in such a way that 
improvement seems likely. 


Demonstrable program-level learning 
improvement can be achieved only 
through changes that match student 
learning outcomes. In other words, if 
we cannot determine what students 
should know, think, or be able to 

do as a result of the programmatic 
changes, how will we know if the 
changes were successful at improving 
student learning? Programs that can 
align changes with student learning 
outcomes in a clear and logical 

way should have greater success 
evidencing improvement. 


RQ 3. What is the rationale behind 
curricular and pedagogical changes? 
Often, there are numerous reasons 
that programs decide to implement 
changes to curricula or pedagogies; it 
is important to explain the rationale 
for making specific pedagogical 
and/or curricular changes (Fulcher 

et al., 2014). Ideally, the rationale 
provided in assessment reports is 

not only explicit, but also originates 
from different sources (e.g., direct 
assessment measures, accreditation 
recommendations, etc.). It is plausible 
that when changes lack robust 
supporting rationale, they are less 
likely to culminate in demonstrable 
student learning improvement. A lack 
of understanding or articulation of the 
rationale for curricular and pedagogical 
changes might contribute to why 
minimal learning improvements are 
found in assessment reports. 


We defined and evaluated the rationale 
for curricular and pedagogical changes 
provided in assessment reports 

based on explicitness and type. For 
explicitness, we coded the report 
rationales as either stated, but not 
explained or stated with an explicit 
rationale. For type, we determined 
whether the source that contributed 

to the rationale was a direct measure, 
an indirect measure, anecdotal (e.g., 
conversations), accreditation or annual 
program review recommendations, or 
realignment of instruction with changes 
in programmatic learning objectives. 


RQ 4. What is the typical stage of 
implementation for curricular and 
pedagogical changes? 

Curricular and pedagogical changes 
take time to implement. For 

instance, Fulcher and colleagues 
(2014) suggested that it could take 

3 to 5 years to make program-level 
adjustments and subsequently use 
assessment results to demonstrate 
improved student learning. In addition 
to time, change requires planning 
and foresight. In order to coordinate 
change efforts, programs should create 
an improvement timeline. Timelines 
articulate when baseline assessment 
data will be collected, when 
pedagogical or curricular changes will 
be implemented, and when students 
will be reassessed to determine 
whether their learning actually 
improved (Fulcher et al., 2014). 


It could be the case that programs 
conceptualize processes of curricular 
and pedagogical changes 1 year at a 
time—correlative of the assessment 
reporting cycle. We encourage 
programs to look beyond an annual 
cycle. Creating a 3- or 5-year plan and 
timeline might help motivate programs 
to use assessment results, make 
changes, and reassess students to 
demonstrate improved learning. 


For the current study, we defined and 
evaluated the stage of implementation 
of change in terms of five criteria. 
Change efforts could be in one of the 
following five stages: 
1. Planning (a program is currently 
planning changes); 
2. In process (a program is currently 
implementing changes; some 
changes but not all have been
made);

3. Completed but have not yet
reassessed;

4. Completed and checked for
efficacy (or effectiveness
reassessed) but no demonstrable
improvement evidenced; or

5. Completed and checked for 
efficacy (or effectiveness 
reassessed) and demonstrable 
improvement evidenced. 


RQ 5. To what degree are programs 
able to close the assessment loop by 
using results to inform changes and 
subsequently demonstrate improved 
student learning? 

The promise of quality assessment 
practice is to enhance learning 

for students and improve higher 
education. That is, if programs 

are typically unable to close the 
assessment loop by using results to 
inform changes and demonstrate 
learning improvement, then 
assessment practice is falling short of 
its promise. 


We addressed RQ 5 via the fifth stage 
of implementation criteria discussed 
previously for RQ 4. More specifically, 
change efforts coded as being at Stage 
5 of implementation represented 
instances of closing the assessment 
loop (i.e., change efforts coded as 
“Stage 5: Completed and checked for 
efficacy (or effectiveness reassessed) 
and demonstrable improvement 


evidenced” were used to address RQ 5). 
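The link between RQ 4 and RQ 5 can be made concrete with a small sketch: if every described change carries one of the five stage codes, a report "closes the loop" when at least one of its changes sits at Stage 5. The class name `Stage`, the function `closed_loop_share`, and the three example programs are hypothetical illustrations, not the study's instrument or data.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Stage-of-implementation codes mirroring the five stages listed for RQ 4."""
    PLANNING = 1
    IN_PROCESS = 2
    COMPLETED_NOT_REASSESSED = 3
    REASSESSED_NO_IMPROVEMENT = 4
    REASSESSED_IMPROVEMENT = 5


def closed_loop_share(stages_by_report):
    """Return the share of reports with at least one change coded at Stage 5."""
    if not stages_by_report:
        return 0.0
    closed = sum(
        1
        for stages in stages_by_report.values()
        if Stage.REASSESSED_IMPROVEMENT in stages
    )
    return closed / len(stages_by_report)


# Hypothetical codes for three made-up reports (NOT data from the study).
example = {
    "Program A": [Stage.PLANNING, Stage.IN_PROCESS],
    "Program B": [Stage.COMPLETED_NOT_REASSESSED, Stage.REASSESSED_IMPROVEMENT],
    "Program C": [Stage.REASSESSED_NO_IMPROVEMENT],
}
share = closed_loop_share(example)  # one of three reports reaches Stage 5
```

Counting at the report level rather than the change level is an assumption of this sketch; it matches the way the study later speaks of a percentage of programs.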


METHOD 


Our home institution is a mid-sized, 
4-year, public university in Virginia. 
The State Council of Higher Education 
for Virginia (SCHEV) and our regional 
accreditor (Commission on Colleges of 
the Southern Association of Colleges 
and Schools, or SACSCOC) require 
colleges and universities to assess 
student learning. In compliance with 
their respective policies and guidelines, 
all academic degree-granting 
programs at our institution submit 
annual assessment reports for student 
learning outcomes. Each year graduate 
students, faculty members, and 
assessment specialists evaluate these 
assessment reports. Through feedback 
and consultation, several programs 

at our institution have demonstrated 
better assessment processes (Rodgers, 
Grays, Fulcher, & Jurich, 2013). 


For this study, we examined all 

54 exemplary assessment reports 
collected from the fall 2012-2013 
reporting cycle. Fifty-four represents 
approximately half of our academic 
degree and certificate programs. 
Exemplary assessment reports received
a score of 3.4 or higher out of 4 on a
meta-assessment rubric (see Appendix
A; Fulcher & Bashkov, 2012; Fulcher
& Orem, 2010). The 3.4 standard was
set in 2011 by trained faculty using a 
modified Angoff procedure. 


Our review included only exemplary 
assessment reports for practical 
reasons; we hypothesized that 
academic programs with established, 
high-quality assessment processes 
might be best poised to use 
assessment results to influence 
pedagogical and curricular changes 
(and subsequently demonstrate 
learning improvement). They also 
might be better equipped to reassess 


students’ learning to determine if 
the implemented changes actually 
promoted learning improvement. 
Furthermore, programs in nascent 
stages of assessment (not close to 
exemplary) are likely focused on 
setting up assessment infrastructure. 
Such programs are typically 
establishing learning objectives, 
creating curriculum maps, and 
selecting assessment instruments. 
These programs, therefore, are less 
likely to have collected data and 
synthesized them into actionable 
results. Of course, use of results is a 
moot point to those programs that 
have not collected data. In essence, 
by focusing on exemplary reports 
we could rule out undeveloped 
assessment practices as an 
explanation for not using results to 
improve student learning. Within 
each exemplary assessment report, 
we identified specific descriptions 
of using results for improvement 
and then used an online Qualtrics 
survey to code each of the identified 
descriptions. 


Procedures for Identifying and 
Coding Descriptions of Results 
To locate specific descriptions of 
using results for improvement, a 
graduate student familiar with the 
meta-assessment rubric (see Appendix 
A) and the layout of assessment 
reports reviewed electronic copies 

of all 2012-2013 assessment reports 
in alphabetical order according to 
program name. The graduate student 
first read Section 6A, “Program 
Modification and Improvement 
Regarding Student Learning and 
Development,” of the assessment 


report; this section asks program 
assessment practitioners to describe 
use of assessment results for student 
learning improvement. If there were 
no examples or evidence of use of 
results to improve student learning 

in Section 6A, the graduate student 
reviewed other sections in the reports. 
If the graduate student initially found 
no evidence of use of results, she set 
the report aside. Later, she rereviewed 
the report, reducing the chance of an 
overlooked example. 


For each assessment report, the 
graduate student identified up to four 
examples that described use of results 
by electronically highlighting sections 
of the report in yellow. Note, of the 

54 exemplary assessment reports, 
there were only two that had more 
than four examples. After the initial 
review and electronic highlighting, 
the graduate student randomized 

the order of the assessment reports 
and rereviewed them, converting the 
yellow highlighting to highlighting in 
red, yellow, green, or blue (i.e., example 
one was highlighted in red, example 
two in yellow, etc.). 


Once the graduate student had 
reviewed all 2012-2013 exemplary 
assessment reports and highlighted all 
identified descriptions of use of results 
for improvement, three authors of this 
paper—along with three other graduate 
students—independently evaluated and 
coded the using-results descriptions via 
an online Qualtrics survey. Specifically, 
raters reviewed all highlighted
descriptions, each representing an
individual “use of results,” in their
assigned assessment reports. The raters


evaluated the following aspects of the 
descriptions: 


* Magnitude of change, defined by
extent or magnitude of changes
made to pedagogy, curricula, and
so on (minor: changes to a small
class assignment in one class;
moderate: change to a unit within
a class; major: major overhaul of a
class; extensive: numerous changes
that affect several classes)

* Extent to which faculty linked
change to student learning
objectives

* Rationale for needing change

* Reported stage of change
implementation


The six raters were paired into 

three groups of two; each group 

was assigned a subset of the 54 
exemplary assessment reports. 
Groups 1, 2, and 3 evaluated a total 
of 20, 21, and 13 different assessment 
reports, respectively. First, each rater 


independently coded the highlighted 
sections in every assigned assessment 
report, then each rater pair adjudicated 
to reach exact agreement on all 

coded sections. For instance, Raters 

1 and 2 were paired and assigned 20 
assessment reports to review; one of 
those assessment reports was from 

the Assessment & Measurement Ph.D. 
program. Each rater independently 
reviewed every highlighted description 
of using results within the Assessment 
& Measurement program assessment 
report. Then, using a Qualtrics survey, 
each rater coded the highlighted 
descriptions. Finally, they reviewed 
each other's ratings and adjudicated 
until they agreed on all ratings for the 
Assessment & Measurement report. 
Each rater pair repeated this process for 
every assigned assessment report. 
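The pairwise procedure described above (independent coding, then adjudication to exact agreement) can be sketched as a comparison of two raters' code sheets. This is only an illustration under stated assumptions: the function `find_disagreements`, the description identifiers, the aspect names, and the codes are all hypothetical, and the actual ratings were collected through a Qualtrics survey rather than Python dictionaries.

```python
def find_disagreements(rater_a, rater_b):
    """Return (description_id, aspect, code_a, code_b) for every mismatched code."""
    disagreements = []
    for desc_id in sorted(rater_a):
        for aspect in sorted(rater_a[desc_id]):
            code_a = rater_a[desc_id][aspect]
            code_b = rater_b[desc_id][aspect]
            if code_a != code_b:
                disagreements.append((desc_id, aspect, code_a, code_b))
    return disagreements


# Hypothetical independent codes from one rater pair (NOT data from the study).
rater_1 = {
    "report07-ex1": {"magnitude": 4, "link": 2, "stage": 3},
    "report07-ex2": {"magnitude": 6, "link": 3, "stage": 2},
}
rater_2 = {
    "report07-ex1": {"magnitude": 4, "link": 3, "stage": 3},
    "report07-ex2": {"magnitude": 6, "link": 3, "stage": 2},
}

# Items the pair would need to discuss before recording a final, agreed code.
to_adjudicate = find_disagreements(rater_1, rater_2)
exact_agreement = not to_adjudicate
```

In this made-up case the pair would discuss a single item, the "link" code for the first description, and repeat the check until `exact_agreement` is true.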


RESULTS 


Across the 54 assessment reports,
we identified and evaluated 162
different descriptions of using
assessment results to make curricular
or pedagogical changes. On average,
we identified three descriptions per
assessment report (M = 3.00, SD = 1.13).
Clearly, reporting assessment data
spurs talk of change. Nevertheless, only
8% of programs (among the 54 reports)
could show that their pedagogical
and curricular changes led to better
learning outcomes. The following
research questions (RQs) explore why
so little learning improvement was
reported despite so many changes
within programs.


Figure 2. Distribution of Magnitude of Curricular and/or Pedagogical Changes
Across All Coded Program Assessment Reports
(Bar chart of frequency by magnitude code: 1-N/A, 2-Unclear, 3-Minor, 4-Moderate, 5-Major, 6-Extensive. Bar labels, order not recoverable: 49, 34, 33, 19, 18, 9.)


Figure 3. Percent of Program Assessment Reports That Contained Different
Numbers of Curricular and/or Pedagogical Changes Coded as Being Extensive
in Magnitude (e.g., Percent of Program Reports That Had Either 0, 1, 2, 3, or 4
Extensive Changes)

RQ 1. How extensive in magnitude are 
the reported changes to curricula and 
pedagogies? 

Recall, researchers rated the magnitude 
of curricular and pedagogical 

changes described in programmatic 
assessment reports based on the 
reported magnitude of changes made 
to courses, curricula, pedagogies, and 
so on. For instance, a change that 
involved only minimal adjustments 

to one assignment in one course 
would be rated as minor. Such an 
adjustment would not be expected to 
have a demonstrable, positive effect 


on student learning, at the program 
or departmental level. Comparatively, 
a change that involved extensive 
modifications that affected multiple 
courses within the program or 
department would be expected to 
have a more demonstrable influence 
on program- or department-level 
student learning. 


The magnitude of curricular and/ 

or pedagogical changes was slightly 
negatively skewed (see Figure 2). 

In other words, the majority of the 
identified changes were coded as either 
moderate (a coded score of 4), major (a 
coded score of 5), or extensive (a coded 
score of 6) in magnitude. On average, 
the described changes were coded as 
moderate (M = 4.10, SD = 1.45). Nearly 
20 of the identified changes were 
coded as unclear because, although 
faculty described a change, they did 
not provide enough information about 
the change to accurately identify its 
magnitude. For assessment reports in 
which faculty said they made a change, 
but then included no description of 
the change whatsoever, researchers 
applied the code “N/A.” 
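As a rough consistency check, the reported mean and standard deviation can be recomputed from a frequency table. The counts below are an assumption, not values stated in the text: they assign the six bar labels that survive from Figure 2 (49, 34, 33, 19, 18, 9) to the six magnitude codes by inference, choosing the assignment that fits the "nearly 20 unclear" remark and the reported M = 4.10. The use of the sample (n - 1) formula for SD is likewise an assumption.

```python
import math

# ASSUMPTION: inferred mapping of Figure 2 bar labels to magnitude codes
# (1-N/A, 2-Unclear, 3-Minor, 4-Moderate, 5-Major, 6-Extensive).
# The source does not state which count belongs to which category.
counts = {1: 9, 2: 19, 3: 18, 4: 49, 5: 34, 6: 33}

n = sum(counts.values())  # total number of coded descriptions

# Weighted mean of the magnitude codes.
mean = sum(code * freq for code, freq in counts.items()) / n

# Sample standard deviation (n - 1 in the denominator).
sum_sq = sum(freq * (code - mean) ** 2 for code, freq in counts.items())
sd = math.sqrt(sum_sq / (n - 1))

print(f"n = {n}, M = {mean:.2f}, SD = {sd:.2f}")
```

Under this assumed mapping the script reproduces n = 162, M = 4.10, and SD = 1.45. That agreement supports the mapping but does not prove it.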


Of the 54 program assessment
reports, 54% had zero curricular and/
or pedagogical changes coded as
extensive in magnitude (see Figure
3). About 33% of the 54 assessment
reports had one change coded as
extensive in magnitude, 11% had
two such extensive changes, and 2%
had three. None of the
54 assessment reports contained
four changes coded as extensive in
magnitude. In essence, nearly half
(46% of programs) reported the type of
extensive pedagogical and curricular
changes most often associated with
learning improvement. However, these
extensive changes equated to fewer
examples of learning improvement
than one might expect: only 8%. The
results for RQs 2 to 4 provide more
explanation of why these extensive
changes led to so few examples of
evidenced improvements.


Figure 4. Frequency of Identified Curricular and/or Pedagogical Changes That
Were Linked or Aligned to Student Learning Outcomes
(Bar chart of frequency by extent to which curricular/pedagogical changes were linked to a learning outcome: Not Clear; Change was linked to learning but not to specific SLO; Change was linked to a SLO but lacked specificity; Change was clearly, specifically, explicitly linked to SLO.)
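The RQ 1 percentages for extensive changes can be translated back into approximate report counts. This is a minimal sketch that assumes each reported percentage is a rounded share of the 54 reports; the variable names are illustrative and the resulting counts are derived here, not quoted from the study.

```python
N_REPORTS = 54

# Reported percent of reports containing 0, 1, 2, 3, or 4 extensive changes.
pct_by_n_extensive = {0: 54, 1: 33, 2: 11, 3: 2, 4: 0}

# ASSUMPTION: each percentage is a rounded share of the 54 reports.
reports_by_n_extensive = {
    k: round(pct / 100 * N_REPORTS) for k, pct in pct_by_n_extensive.items()
}

total_reports = sum(reports_by_n_extensive.values())

# Reports with at least one change coded as extensive.
reports_with_extensive = sum(
    count for k, count in reports_by_n_extensive.items() if k >= 1
)
share_with_extensive = reports_with_extensive / N_REPORTS

# Total number of extensive changes implied across all reports.
total_extensive_changes = sum(
    k * count for k, count in reports_by_n_extensive.items()
)

print(reports_by_n_extensive, f"{share_with_extensive:.0%}", total_extensive_changes)
```

Under this assumption the counts are 29, 18, 6, 1, and 0 reports, which sum to 54. That gives 25 reports (46%) with at least one extensive change and 33 extensive changes in total.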


RQ 2. To what extent are curricular 
and pedagogical changes linked to 
student learning outcomes? 
Typically, curricular and pedagogical 
changes were linked to student 
learning generally (a coded score of 
2), but were not matched to a specific, 
program-level student learning 
outcome (M = 2.40, SD = 1.08). As 
Figure 4 shows, approximately 34
out of the 162 identified curricular or
pedagogical changes, or 21%, did not
include enough details for raters to
evaluate the alignment between the
change and the program's student 
learning outcome(s). The lack of 
explicit alignment between changes 
and student learning outcomes might 
be contributing to the issue at hand: 
insufficient use of assessment results to 
evidence improved student learning. 


For many, the link between curricular 
and pedagogical changes and specific 
student learning objectives might be 
implicit. However, documenting the 
use of assessment results to influence 
pedagogical or curricular changes that 
lead to improved student learning 
requires explicit connections between 
implemented changes and student 
learning outcomes. It seems that 
assessment practitioners and support 
services need to better conceptualize 
and articulate the importance of 
matching changes to student learning 
outcomes. 


RQ 3. What is the rationale behind 
curricular and pedagogical changes? 
About 80% of the identified
descriptions of curricular or
pedagogical changes provided a
rationale that conveyed the need
for change. But just over 50% of
the descriptions of curricular or
pedagogical changes provided a
rationale and mentioned the source
that supported the rationale (e.g., direct
assessment measures or accreditation
or program review recommendations).
In addition, about 19% of the
identified descriptions of curricular
or pedagogical changes provided no
rationale. The most frequently provided
rationale behind the described
curricular or pedagogical changes
was data from direct assessment
measures. In contrast, few cited
accreditation/program review as a
rationale for a given change; none
mentioned curriculum realignment.
For the program assessment reports
that provided a source explaining their
intended curricular and/or pedagogical
change, Figure 5 displays the percent
of reports that cited various sources of
rationales for changes.

Perhaps programs recognize the 
results of direct assessment measures, 
instead of feedback from accreditation/ 
program reviews, as potential sources 
for change. In addition, some did
not include any rationale to support
changes to pedagogies and/or 
curricula. Perhaps the importance of 
understanding and describing the 
driving forces behind program-level 
changes is not recognized. Or, what 
might be a supportive rationale is not 
included because the report writer(s) 
believed the rationale was implied. 


Figure 5. Of the Program Assessment Reports That Provided a Rationale and
Source Explaining Their Intended Curricular and/or Pedagogical Change, Percent
of Reports That Cited Various Sources of Rationales for Changes
[Chart. Labels: Re-alignment with outcomes, 0%; Accreditation/Program Review Recommendations, 5%.]


Figure 6. Distribution of Stage of Change Implementation Ratings Across All
Coded Program Assessment Reports


Note that the meta-assessment rubric used
at our institution in 2012-2013, the
year of these reports (see Appendix A),
did not require an explicit rationale
to support curricular or pedagogical
changes. Nonetheless, explicitly
describing the rationale underlying
change is an essential part of using
results to demonstrably improve
student learning (Fulcher et al., 2014).
Given that assessment measures were the
most frequently cited rationale for
curricular and pedagogical changes,
intrinsic buy-in for change might be
nonexistent. Alternatively, curricular
and pedagogical changes that lack
adequate rationale might not be well
aligned with students’ learning needs,
program resources, faculty sentiments,
or administrative agendas.


RQ 4. What is the typical stage of 
implementation for curricular and 
pedagogical changes? 
Encouragingly, about 56% (85 out
of 153) of the described curricular
and pedagogical changes were
complete. Yet, only 14% (21 out of
153) of all described curricular and
pedagogical changes included
follow-up reassessments (see Figure 6).
Again, in 2012-2013 the crucial
reassessment phase had not been
explicitly stated in either our
institutional assessment cycle or our
meta-assessment rubric (see Appendix A).
Therefore, programs might not have
been aware of the importance of
reassessing. Alternatively, many might
mistakenly believe that assessment
work is done as soon as data are used
for curricular and pedagogical change.
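
The percentages quoted for RQ 2 and RQ 4 follow directly from the reported counts. The sketch below simply recomputes them. The counts and stated percentages come from the text, whereas the labels, the `reported` dictionary, and the `percent` helper are hypothetical names introduced for illustration.

```python
# (count, total, percentage stated in the text) for each reported result.
reported = {
    "RQ 2: alignment not clear": (34, 162, 21),
    "RQ 4: change complete": (85, 153, 56),
    "RQ 4: change included reassessment": (21, 153, 14),
}


def percent(count, total):
    """Return count/total as a percentage rounded to a whole number."""
    return round(100 * count / total)


for label, (count, total, stated) in reported.items():
    computed = percent(count, total)
    print(f"{label}: {count}/{total} = {computed}% (stated: {stated}%)")
```

Under these assumptions, each recomputed value matches the rounded percentage given in the text.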


As Fulcher and colleagues (2014) 
explain, assessment practitioners, 
faculty members, and other 
stakeholders often confuse program 
changes with program improvements. 
A change is only an improvement 
when, upon reassessment, students 
demonstrate greater proficiency. 
Essentially, merely implementing 
curricular or pedagogical changes 
does not provide demonstrable proof 
of improved student learning, just as 
a pig never fattened because it was 
weighed. Assessment practitioners
can do a better job of articulating
and promoting the use of assessment
results for improved student learning.


RQ 5. To what degree are programs 
able to close the assessment loop by 
using results to inform changes and 
subsequently demonstrate improved 
student learning? 


As foreshadowed at the beginning of 
this section, only 8% of the evaluated 
curricular and pedagogical changes 
were implemented, reassessed, and 
demonstrated improved student 
learning. Our interpretation of this
finding is that either our university
programs are not closing the loop or
they do not know how to articulate
such a process in an
assessment report. Little integration of
assessment processes with pedagogy
and curricular design suggests a lack
of clarity about learning improvement
within our institution.


CONCLUSION 


Even after more than 25 years of
assessment practice at our university,
finding evidence of student learning
improvement in assessment reports
is akin to finding a needle in a
haystack. To understand more about
this most important phase of the 
assessment cycle, we qualitatively 
reviewed and coded 54 exemplary 
assessment reports from academic 
programs across our campus. In these 
assessment reports, writers described 
changes to course scaffolding, use of 
different classroom pedagogies, course 
redesigns, and so on. Furthermore, the 
curricular and pedagogical changes 
described were typically coded as 
being moderate in magnitude and 
were primarily driven by data from 
direct assessment measures. 


However, under scrutiny, the thread 
from the “Use of Results” section to 
demonstrable student learning was 
typically thin and loose. Few programs 
could demonstrate the positive impacts 
of the curricular and pedagogical 
changes they made. Based on 
descriptions in the assessment reports, 
programs rarely conducted follow-up 
reassessment research to determine 
whether curricular and pedagogical 
changes had a demonstrable impact 
on student learning outcomes. Perhaps 
this finding can help explain why use of 
assessment results has not contributed 
enough to improving student learning 
outcomes in higher education (Kuh, 
Jankowski, Ikenberry, & Kinzie, 2014). 


The inability to empirically demonstrate
improved student learning was not
for lack of earnest efforts to improve.
That is, some programs conceptualized
curricular and pedagogical changes,
provided some rationale to support
these changes, and implemented the
changes in their entirety. Yet, many of
the program assessment reports lacked
one or more critical elements, including


» Major or extensive pedagogical
changes (i.e., changes at the
program level);

» Tenable links between curricular
and pedagogical changes and
student learning outcomes;

» Convincing rationales to support
curricular and pedagogical
changes; and

» Adequate reassessment processes
that can determine whether
changes actually improved student
learning.


Assessment Practitioners’
Role in Bridging the Gap
between Using Results and
Demonstrating Student
Learning Improvement

In general, higher education 
stakeholders have not successfully 
evidenced systematic improvements 
in student learning at the academic 
program level. Although our institution
is making some progress, it certainly
struggles as well. From a policy perspective,
being a good shepherd of resources
suggests that institutions are
making earnest efforts to improve.
Academe’s failure to demonstrate such
improvement definitely contributes to
the “Is college worth it?” conversation
(Taylor et al., 2011).


To answer questions of worth and 
demonstrate the value of a college 
education, assessment results need to 
influence pedagogical and curricular 
changes at a program level. Ultimately, 
explicit gains in student learning
should be clearly articulated via
assessment reports, presentations,
and other channels of dissemination.
Assessment practitioners must do 
more to communicate the importance 
of student learning improvement 
initiatives. 


Findings from the current study reflect 
Blaich and Wise’s (2011) observation 
that excellent assessment—by itself— 
does not lead to learning improvement. 
In addition, our results suggested that 
practitioners could increase student 
learning improvement by helping 
programs 


1. Develop and implement more 
widespread and multiyear 
curricular and pedagogical 
changes; 

2. Situate improvement efforts 
within student learning 
outcomes; 

3. Understand the important role of 
reassessment; and 

4. Use a framework or step-by-step 
example to more effectively 
report and explain crucial 
information. 


As Fulcher and Bashkov (2012) explain, 
we should not be surprised that 
assessment reports lack adequate 
descriptions of using results to 
demonstrate improved student 
learning. At our institution, we did
not offer enough guidance with
respect to how to report our learning
improvement efforts. In addition, we
realized that we have no assessment 
staff trained in pedagogy, curriculum, 
course redesign, course scaffolding, or 
organizational change. 


Lacking holistic expertise within
our own assessment office, we
engaged in a more intentional
partnership with our campus faculty
development center. Doing so allows
us to better serve faculty members as 
they create and implement curricular 
and pedagogical changes, and then 
reassess students’ learning. For 
instance, the faculty development 
experts assist programs as they 
articulate student learning outcomes 
and align them with program theory. 


We hope that the recommendations 
from the current study can assist 
institutions in better conceptualizing, 
articulating, implementing, reporting, 
and disseminating learning 
improvement success stories. We
should also note that the principles of
making changes of greater magnitude,
aligning actions, reassessing to
determine the effect of actions, and
providing step-by-step examples for
improvement can be extended beyond
learning. These general principles
could be applied to retention efforts,
donor giving, or other important
efforts. The more we discuss
improvement, the better institutional
decision makers we become.


Study Limitations and Future 
Directions 

Thus far we have evaluated assessment 
reports from only one institution. 
Our findings might not reflect other 
institutions, especially those with 
different assessment practices and 
educational research initiatives. In 
addition, we have evaluated reports 
from only a single year’s reporting 
cycle. Replicating our study across
various reporting cycles, and across
institutions, would reveal potential
longitudinal trends and could provide
external validity evidence for our
findings.


In addition, future research should 
include interviews with faculty 
members who crafted the assessment 
reports. Through these qualitative 
data, institutional effectiveness 
researchers could further investigate 
faculty perceptions of the magnitude 
of their described changes to 
curriculum and pedagogy. A rigorous 
qualitative follow-up study could also 
provide crucial insights from faculty 
members to clarify why certain types 
of information and explanations were 
absent from the reviewed assessment 
reports. 


Appendix A. Assessment Progress Template (APT) Evaluation Rubric as Described in Fulcher & Orem (2010)


Assessment Progress Template (APT) Evaluation Rubric


1. Student-centered learning objectives

A. Clarity and Specificity
Level 1: No objectives stated.
Level 2: Objectives present, but with imprecise verbs (e.g., know, understand), vague description of content/skill/or attitudinal domain, and non-specificity of whom should be assessed (e.g., “students”).
Level 3: Objectives generally contain precise verbs, rich description of the content/skill/or attitudinal domain, and specification of whom should be assessed (e.g., “graduating seniors in the Biology B.A. program”).
Level 4: All objectives stated with clarity and specificity including precise verbs, rich description of the content/skill/or attitudinal domain, and specification of whom should be assessed (e.g., “graduating seniors in the Biology B.A. program”).

B. Orientation
Level 1: No objectives stated in student-centered terms.
Level 2: Some objectives stated in student-centered terms.
Level 3: Most objectives stated in student-centered terms.
Level 4: All objectives stated in student-centered terms (i.e., what a student should know, think, or do).


2. Course/learning experiences that are mapped to objectives

Level 1: No activities/courses listed.
Level 2: Activities/courses listed but link to objectives is absent.
Level 3: Most objectives have classes and/or activities linked to them.
Level 4: All objectives have classes and/or activities linked to them.


3. Systematic method for evaluating progress on objectives

A. Relationship between measures and objectives
Level 1: Seemingly no relationship between objectives and measures.
Level 2: At a superficial level, it appears the content assessed by the measures matches the objectives, but no explanation is provided.
Level 3: General detail about how objectives relate to measures is provided. For example, the faculty wrote items to match the objectives, or the instrument was selected “because its general description appeared to match our objectives.”
Level 4: Detail is provided regarding objective-to-measure match. Specific items on the test are linked to objectives. The match is affirmed by faculty subject experts (e.g., through a backwards translation).

B. Types of Measures
Level 1: No measures indicated.
Level 2: Most objectives assessed primarily via indirect (e.g., surveys) measures.
Level 3: Most objectives assessed primarily via direct measures.
Level 4: All objectives assessed using at least one direct measure (e.g., tests, essays).

C. Specification of desired results for objectives
Level 1: No a priori desired results for objectives.
Level 2: Statement of desired result (e.g., student growth, comparison to previous year’s data, comparison to faculty standards, performance vs. a criterion), but no specificity (e.g., students will grow; students will perform better than last year).
Level 3: Desired result specified. (e.g., our students will gain 2 standard deviation from junior to senior year; our students will score above a faculty-determined standard). “Gathering baseline data” is acceptable for this rating.
Level 4: Desired result specified and justified (e.g., Last year the typical student scored 20 points on measure x. The current cohort underwent more extensive coursework in the area, so we hope that the average student scores 22 points or better.)

D. Data collection & Research design integrity
Level 1: No information is provided about data collection process or data not collected.
Level 2: Limited information is provided about data collection such as who and how many took the assessment, but not enough to judge the veracity of the process (e.g., thirty-five seniors took the test).
Level 3: Enough information is provided to understand the data collection process, such as a description of the sample, testing protocol, testing conditions, and student motivation. Nevertheless, several methodological flaws are evident such as unrepresentative sampling, inappropriate testing conditions, one rater for ratings, or mismatch with specification of desired results.
Level 4: The data collection process is clearly explained and is appropriate to the specification of desired results (e.g., representative sampling, adequate motivation, two or more trained raters for performance assessment, pre-post design to measure gain, cutoff defended for performance vs. a criterion).

E. Additional validity evidence
Level 1: No additional psychometric properties provided.
Level 2: Reliability estimates (e.g., internal consistency, test-retest, inter-rater) provided for most scores, although reliability tends to be poor (<.60). Or, author states how efforts have been made to improve reliability (e.g., raters were trained on rubric).
Level 3: Reliability estimates provided for most scores, most scores are marginal or better (>.60).
Level 4: Reliability estimates provided, most scores are marginal or better (>.60). Plus, other evidence given such as relationship of scores to other variables and how such relationship strengthens or weakens argument for validity of test scores.


4. Results of program assessment


A. Presentation of results
Level 1: No results presented.
Level 2: Results are present, but it is unclear how they relate to the objectives or the desired results for the objectives.
Level 3: Results are present, and they directly relate to the objectives and the desired results for objectives but presentation is sloppy or difficult to follow. Statistical analysis may or may not be present.
Level 4: Results are present, and they directly relate to objectives and the desired results for objectives, are clearly presented, and were derived by appropriate statistical analyses.

B. History of results
Level 1: No results presented.
Level 2: Only current year’s results provided.
Level 3: Past iteration(s) of results (e.g., last year’s) provided for some assessments in addition to current year’s.
Level 4: Past iteration(s) of results (e.g., last year’s) provided for majority of assessments in addition to current year’s.

C. Interpretation of Results
Level 1: No interpretation attempted.
Level 2: Interpretation attempted, but the interpretation does not refer back to the objectives or desired results of objectives. Or, the interpretations are clearly not supported by the methodology and/or results.
Level 3: Interpretations of results seem to be reasonable inferences given the objectives, desired results of objectives, and methodology.
Level 4: Interpretations of results seem to be reasonable given the objectives, desired results of objectives, and methodology. Plus, multiple faculty interpreted results (not just one person). And, interpretation includes how classes/activities might have affected results.


Level 1: No evidence of communication.
Level 2: Information provided to limited number of faculty or communication process unclear.
Level 3: Information provided to all faculty, mode and details of communication clear.
Level 4: Information provided to all faculty, mode and details of communication clear. In addition, information shared with others such as advisory committees, other stakeholders, or to conference attendees.


A. Improvement of programs regarding student learning and development
Level 1: No mention of any improvements.
Level 2: Examples of improvements documented but the link between them and the assessment findings is not clear.
Level 3: Examples of improvements (or plans to improve) documented and directly related to findings of assessment. However, the improvements lack specificity.
Level 4: Examples of improvements (or plans to improve) documented and directly related to findings of assessment. These improvements are very specific (e.g., approximate dates of implementation and where in curriculum they will occur).

B. Improvement of assessment process
Level 1: No mention of how this iteration of assessment is improved from past administrations.
Level 2: Some critical evaluation of past and current assessment, including acknowledgement of flaws, but no evidence of improving upon past assessment or making plans to improve assessment in future iterations.
Level 3: Critical evaluation of past and current assessment, including acknowledgement of flaws; Plus evidence of some moderate revision, or general plans for improvement of assessment process.
Level 4: Critical evaluation of past and current assessment, including acknowledgement of flaws; both present improvements and intended improvements are provided; for both, specific details are given. Either present improvements or intended improvements must encompass a major revision.